Smart Paper Notes

Using Machine Learning to Test Causal Hypotheses in Conjoint Analysis

Dae Woong Ham, Kosuke Imai, Lucas Janson

Category: Machine Learning (Statistics)

2022-01-20

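Conjoint analysis is a popular experimental design for measuring multidimensional preferences. Researchers examine how varying a factor of interest, while controlling for other relevant factors, influences decision-making. Currently, there are two methodological approaches to analyzing data from a conjoint experiment. The first focuses on estimating the average marginal effect of each factor while averaging over the other factors. Although this allows for straightforward design-based estimation, the results depend critically on the distribution of the other factors and on how interaction effects are aggregated. An alternative model-based approach can compute various quantities of interest, but it requires researchers to specify the model correctly, which is a challenging task for conjoint analysis with many factors and possible interactions. In addition, the commonly used logistic regression has poor statistical properties once interactions are included, even with a moderate number of factors. We propose a new hypothesis testing approach based on the conditional randomization test to answer the most fundamental question of conjoint analysis: does a factor of interest matter, given the other factors? Our methodology is based solely on the randomization of factors and is therefore assumption-free. Yet it allows researchers to use any test statistic, including those based on complex machine learning algorithms. As a result, we are able to combine the strengths of the existing design-based and model-based approaches. We illustrate the proposed methodology through conjoint analyses of immigration preferences and political candidate evaluation. We also extend the proposed approach to test the regularity assumptions commonly used in conjoint analysis. An open-source software package is available for implementing the proposed methodology.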
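
Note: the conditional randomization test mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' software package. The function names (`crt_p_value`, `abs_mean_difference`, `draw_uniform_binary`), the binary factor, and the toy statistic are all assumptions made for illustration. The idea shown is only the general recipe: redraw the factor of interest from its known randomization distribution, recompute the test statistic, and compare against the observed value.

```python
import random


def crt_p_value(x, z, y, statistic, draw_x, n_resamples=500, seed=0):
    """Randomization p-value for "does x matter given z?" (illustrative).

    x: observed levels of the factor of interest
    z: the other factors, held fixed
    y: observed responses
    statistic: any function T(x, z, y) -> float, larger = more evidence
    draw_x: function (rng, z_i) -> new level drawn from the known
            randomization distribution of x given z_i
    """
    rng = random.Random(seed)
    observed = statistic(x, z, y)
    exceed = 0
    for _ in range(n_resamples):
        # Redraw the factor of interest; other factors and responses stay fixed.
        x_new = [draw_x(rng, z_i) for z_i in z]
        if statistic(x_new, z, y) >= observed:
            exceed += 1
    # The +1 terms count the observed assignment itself.
    return (1 + exceed) / (1 + n_resamples)


def abs_mean_difference(x, z, y):
    """Toy statistic (assumption): |mean(y | x=1) - mean(y | x=0)|."""
    ones = [yi for xi, yi in zip(x, y) if xi == 1]
    zeros = [yi for xi, yi in zip(x, y) if xi == 0]
    if not ones or not zeros:
        return 0.0
    return abs(sum(ones) / len(ones) - sum(zeros) / len(zeros))


def draw_uniform_binary(rng, z_i):
    """Assumed design: a binary factor randomized uniformly, independent of z."""
    return rng.randint(0, 1)


if __name__ == "__main__":
    x = [i % 2 for i in range(200)]
    z = [0] * 200
    y = list(x)  # response fully determined by the factor of interest
    print(crt_p_value(x, z, y, abs_mean_difference, draw_uniform_binary))
```

Any statistic can be passed as `statistic`, including one computed from a fitted machine learning model, because validity in this sketch comes from the resampling step rather than from the statistic.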

Can Current Task-oriented Dialogue Models Automate Real-world Scenarios in the Wild?

Sang-Woo Lee, Sungdong Kim, Donghyeon Ko, Donghoon Ham, Youngki Hong, Shin Ah Oh, Hyunhoon Jung, Wangkyo Jung, Kyunghyun Cho, Donghyun Kwak

Category: Natural Language Processing

2022-12-20

Task-oriented dialogue (TOD) systems are mainly based on the slot-filling-based TOD (SF-TOD) framework, in which dialogues are broken down into smaller, controllable units (i.e., slots) to fulfill a specific task. A series of approaches based on this framework have achieved remarkable success on various TOD benchmarks. However, we argue that the current TOD benchmarks are limited to surrogates of real-world scenarios and that the current TOD models are still a long way from handling those scenarios. In this position paper, we first identify the current status and limitations of SF-TOD systems. We then explore the WebTOD framework, an alternative direction for building a scalable TOD system when a web/mobile interface is available. In WebTOD, the dialogue system, powered by a large-scale language model, learns how to understand the web/mobile interface that the human agent interacts with.

Improving group robustness under noisy labels using predictive uncertainty

Dongpin Oh, Dae Lee, Jeunghyun Byun, Bonggun Shin

Category: Machine Learning | Computer Vision

2022-12-14

Standard empirical risk minimization (ERM) can underperform on certain minority groups (e.g., waterbirds on land or landbirds on water) because of spurious correlations between the input and its label. Several studies have improved worst-group accuracy by focusing on high-loss samples. The hypothesis behind this is that such high-loss samples are spurious-cue-free (SCF) samples. However, these approaches can be problematic, since in real-world scenarios high-loss samples may also be samples with noisy labels. To resolve this issue, we use the predictive uncertainty of a model to improve worst-group accuracy under noisy labels. To motivate this, we show theoretically that high-uncertainty samples are the SCF samples in the binary classification problem. This result implies that predictive uncertainty is an adequate indicator for identifying SCF samples in a noisy-label setting. Motivated by this, we propose a novel ENtropy based Debiasing (END) framework that prevents models from learning spurious cues while remaining robust to noisy labels. In the END framework, we first train an identification model and use its predictive uncertainty to obtain the SCF samples from the training set. Then, another model is trained on the dataset augmented with an oversampled SCF set. The experimental results show that our END framework outperforms other strong baselines on several real-world benchmarks that consider both noisy labels and spurious cues.
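
Note: the two-step recipe in the abstract (pick high-uncertainty samples, then oversample them) can be sketched roughly as follows. This is not the END implementation. It assumes that uncertainty is measured as the entropy of a single predicted probability in a binary problem, and the function names and the threshold are hypothetical.

```python
import math


def binary_entropy(p):
    """Shannon entropy (in nats) of a Bernoulli(p) prediction."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def select_high_uncertainty(probs, threshold):
    """Indices whose predictive entropy exceeds the (assumed) threshold."""
    return [i for i, p in enumerate(probs) if binary_entropy(p) > threshold]


def oversample(indices, scf_indices, factor):
    """Training index list with the selected samples repeated `factor` extra times."""
    return list(indices) + list(scf_indices) * factor


if __name__ == "__main__":
    # Hypothetical predicted probabilities from an identification model.
    probs = [0.99, 0.5, 0.02, 0.6]
    scf = select_high_uncertainty(probs, threshold=0.5)
    print(scf, oversample(range(len(probs)), scf, factor=2))
```

A label-noise-robust selection rule is the point of the abstract: entropy depends only on the predicted distribution, not on the (possibly wrong) label, whereas a loss-based rule depends on both.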

Moving from 2D to 3D: volumetric medical image classification for rectal cancer staging

Joohyung Lee, Jieun Oh, Inkyu Shin, You-sung Kim, Dae Kyung Sohn, Tae-sung Kim, In So Kweon

Category: Computer Vision

2022-09-13

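Volumetric images from magnetic resonance imaging (MRI) provide invaluable information for the preoperative staging of rectal cancer. Above all, accurate preoperative discrimination between the T2 and T3 stages is arguably both the most challenging and the most clinically significant task in rectal cancer treatment, since chemotherapy is usually recommended for patients with T3 (or greater) stage cancer. In this study, we present a volumetric convolutional neural network that accurately discriminates T2 from T3 stage rectal cancer using rectal MR volumes. Specifically, we propose 1) a custom ResNet-based volume encoder that models the inter-slice relationship with late fusion (i.e., 3D convolution at the last layer), 2) a bilinear computation that aggregates the features produced by the encoder into a volume-wise feature, and 3) a joint minimization of triplet loss and focal loss. Using MR volumes of pathologically confirmed T2/T3 rectal cancer, we perform extensive experiments to compare various designs within the residual learning framework. As a result, our network achieves an AUC of 0.831, which is higher than the accuracy of professional radiologist groups. We believe the method can be extended to other volume analysis tasks.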
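
Note: item 3) in the abstract combines two standard losses. The sketch below writes out their textbook forms for a single example. The weighting between the two terms, the default `gamma` and `margin`, and the function names are assumptions, not values taken from the paper.

```python
import math


def focal_loss(p, label, gamma=2.0):
    """Binary focal loss: -(1 - p_t)^gamma * log(p_t) for one prediction."""
    p_t = p if label == 1 else 1.0 - p
    p_t = min(max(p_t, 1e-12), 1.0)  # avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """max(0, d(anchor, positive) - d(anchor, negative) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def joint_loss(p, label, anchor, positive, negative,
               weight=1.0, gamma=2.0, margin=1.0):
    """Focal loss plus a weighted triplet loss (the weight is an assumption)."""
    return (focal_loss(p, label, gamma)
            + weight * triplet_loss(anchor, positive, negative, margin))


if __name__ == "__main__":
    # Hypothetical embeddings: the positive is close to the anchor, the negative is far.
    print(joint_loss(0.9, 1, [0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))
```

With `gamma=0` the focal term reduces to ordinary cross-entropy. Larger `gamma` down-weights examples the classifier already gets right.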

L3: Accelerator-Friendly Lossless Image Format for High-Resolution, High-Throughput DNN Training

Jonghyun Bae, Woohyeon Baek, Tae Jun Ham, Jae W. Lee

Category: Computer Vision

2022-08-18

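The training process of deep neural networks (DNNs) is usually pipelined, with stages for data preparation on CPUs followed by gradient computation on accelerators such as GPUs. In an ideal pipeline, the end-to-end training throughput is ultimately limited by the throughput of the accelerator, not by that of data preparation. In the past, DNN training pipelines achieved near-optimal throughput by using datasets encoded in a lightweight, lossy image format such as JPEG. However, as high-resolution, losslessly encoded datasets become more popular for applications that require high accuracy, a performance problem arises in the data preparation stage because of low-throughput image decoding on the CPU. We therefore propose L3, a custom lightweight, lossless image format for high-resolution, high-throughput DNN training. The decoding process of L3 is effectively parallelized on the accelerator, which minimizes CPU intervention for data preparation during DNN training. L3 achieves 9.29x higher data preparation throughput than PNG, the most popular lossless image format, for the Cityscapes dataset on an NVIDIA A100 GPU, which leads to 1.71x higher end-to-end training throughput. Compared to JPEG and WebP, two popular lossy image formats, L3 provides up to 1.77x and 2.87x higher end-to-end training throughput for ImageNet, respectively, at equivalent metric performance.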

Deep learning-based denoising for fast time-resolved flame emission spectroscopy in high-pressure combustion environment

Taekeun Yoon, Seon Woong Kim, Hosung Byun, Younsik Kim, Campbell D. Carter, Hyungrok Do

Category: Machine Learning

2022-07-29

A deep learning strategy is developed for fast and accurate gas property measurements using flame emission spectroscopy (FES). In particular, short-gated fast FES is essential for resolving fast-evolving combustion behaviors. However, as the exposure time for capturing the flame emission spectrum gets shorter, the signal-to-noise ratio (SNR) decreases, and the characteristic spectral features that indicate the gas properties become relatively weaker. As a result, property estimation based on the short-gated spectrum is difficult and inaccurate. Denoising convolutional neural networks (CNNs) can enhance the SNR of the short-gated spectrum. A new CNN architecture is proposed that includes a reversible down- and up-sampling (DU) operator and a loss function based on proper orthogonal decomposition (POD) coefficients. For training and testing the CNN, flame chemiluminescence spectra were captured from a stable methane-air flat flame using a portable spectrometer (spectral range: 250 - 850 nm, resolution: 0.5 nm) with varied equivalence ratio (0.8 - 1.2), pressure (1 - 10 bar), and exposure time (0.05, 0.2, 0.4, and 2 s). The long-exposure (2 s) spectra were used as the ground truth when training the denoising CNN. A kriging model with POD is trained on the long-gated spectra for calibration and then predicts the gas properties from the denoised short-gated spectrum. The prediction errors for pressure and equivalence ratio were markedly reduced despite the low SNR that accompanies reduced exposure.

Bi-directional Contrastive Learning for Domain Adaptive Semantic Segmentation

Geon Lee, Chanho Eom, Wonkyung Lee, Hyekang Park, Bumsub Ham

Category: Computer Vision

2022-07-22

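We present a novel unsupervised domain adaptation method for semantic segmentation that generalizes a model trained on source images and their corresponding ground-truth labels to a target domain. A key to domain adaptive semantic segmentation is learning domain-invariant and discriminative features without target ground-truth labels. To this end, we propose a bi-directional pixel-prototype contrastive learning framework that minimizes intra-class variations of features for the same object class while maximizing inter-class variations for different classes, regardless of domain. Specifically, our framework aligns pixel-level features in the target image with a prototype of the same object class in the source image (i.e., positive pairs), sets them apart for different classes (i.e., negative pairs), and performs the alignment and separation processes in the other direction as well, with pixel-level features in the source image and a prototype in the target image. The cross-domain matching encourages domain-invariant feature representations, while the bi-directional pixel-prototype correspondences aggregate features of the same object class, providing discriminative features. To establish training pairs for contrastive learning, we propose to generate dynamic pseudo labels for target images using a non-parametric label transfer, that is, pixel-prototype correspondences across different domains. We also present a calibration method that gradually compensates for class-wise domain biases of the prototypes during training.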
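
Note: the non-parametric label transfer described in the abstract can be pictured with a small sketch. It assumes that a prototype is the mean feature of a class and that a target pixel takes the label of its most similar source prototype under cosine similarity. Both choices, and all function names, are assumptions for illustration. The paper's calibration step and its contrastive loss are not shown.

```python
import math


def class_prototypes(features, labels):
    """Mean feature vector per class (assumed definition of a prototype)."""
    sums, counts = {}, {}
    for feature, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feature)
            counts[label] = 0
        sums[label] = [s + a for s, a in zip(sums[label], feature)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def cosine(u, v):
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def transfer_labels(target_features, prototypes):
    """Pseudo label for each target pixel: class of the most similar prototype."""
    return [max(prototypes, key=lambda c: cosine(f, prototypes[c]))
            for f in target_features]


if __name__ == "__main__":
    # Hypothetical 2-D pixel features with source labels.
    source = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    labels = ["road", "road", "sky", "sky"]
    protos = class_prototypes(source, labels)
    print(transfer_labels([[0.8, 0.2], [0.2, 0.7]], protos))
```

Because the pseudo labels are recomputed from the current features and prototypes, they change as training proceeds, which is one way to read "dynamic" in the abstract.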

OIMNet++: Prototypical Normalization and Localization-aware Learning for Person Search

Sanghoon Lee, Youngmin Oh, Donghyeon Baek, Junghyup Lee, Bumsub Ham

Category: Computer Vision

2022-07-21

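We address the task of person search, that is, localizing and re-identifying query persons from a set of raw scene images. Recent approaches are typically built upon OIMNet, a pioneering work on person search that learns joint person representations for performing both detection and person re-identification (reID). To obtain the representations, they extract features from pedestrian proposals and then project them onto a unit hypersphere with L2 normalization. These methods also treat all positive proposals that sufficiently overlap with the ground truth equally when learning person representations for reID. We have found that 1) L2 normalization without considering feature distributions degrades the discriminative power of person representations, and 2) positive proposals often also depict background clutter and overlapping persons, which can encode noisy features into the person representations. In this paper, we introduce OIMNet++, which addresses these limitations. To this end, we introduce a novel normalization layer, dubbed ProtoNorm, that calibrates features from pedestrian proposals while considering the long-tail distribution of person IDs, making the L2-normalized person representations discriminative. We also propose a localization-aware feature learning scheme that encourages better-aligned proposals to contribute more to learning discriminative representations. Experimental results and analysis on standard person search benchmarks demonstrate the effectiveness of OIMNet++.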
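
Note: the projection onto a unit hypersphere mentioned in the abstract is plain L2 normalization, sketched below with hypothetical function names. The sketch shows only that baseline step. It does not implement ProtoNorm, whose details are not given in the abstract.

```python
import math


def l2_normalize(v, eps=1e-12):
    """Scale a feature vector to unit Euclidean length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / max(norm, eps) for a in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    # After normalization the dot product equals cosine similarity,
    # so feature magnitude no longer affects matching.
    query = l2_normalize([3.0, 4.0])
    gallery = l2_normalize([30.0, 40.0])
    print(query, dot(query, gallery))
```

Normalization discards the magnitude of each feature and keeps only its direction, which is why the distribution of features before normalization affects how well the directions separate identities.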

Training Patch Analysis and Mining Skills for Image Restoration Deep Neural Networks

Jae Woong Soh, Nam Ik Cho

Category: Computer Vision

2022-07-03

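There have been numerous image restoration methods based on deep convolutional neural networks (CNNs). However, most of the literature on this topic has focused on network architectures and loss functions, with less detail on training methods. Hence, some of the works are not easily reproducible, because hidden training skills must be known to obtain the same results. Regarding the training dataset specifically, few works have discussed how to prepare and order the training image patches. Moreover, capturing new datasets to train a restoration network for real-world scenes is costly. We therefore believe it is necessary to study the preparation and selection of training data. In this regard, we present an analysis of training patches and explore the consequences of different patch extraction methods. Finally, we propose a guideline for extracting patches from given training images.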
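
Note: for readers unfamiliar with patch extraction, the most basic variant is a sliding window with a fixed stride, sketched below. This is a generic baseline with hypothetical names, not the guideline proposed in the paper, which the abstract does not spell out.

```python
def extract_patches(image, patch_size, stride):
    """Square patches from a 2-D image (list of rows) on a regular grid."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append(patch)
    return patches


if __name__ == "__main__":
    # A hypothetical 4x4 image whose pixel values are their raster index.
    image = [[r * 4 + c for c in range(4)] for r in range(4)]
    print(len(extract_patches(image, patch_size=2, stride=2)))
    print(len(extract_patches(image, patch_size=2, stride=1)))
```

The stride controls how much neighboring patches overlap, which is one of the preparation choices that papers often leave unstated.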

Variational Deep Image Restoration

Jae Woong Soh, Nam Ik Cho

Category: Computer Vision

2022-07-03

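This paper presents a new variational inference framework for image restoration and a convolutional neural network (CNN) structure that can solve the restoration problems described by the proposed framework. Earlier CNN-based image restoration methods focused primarily on network architecture design or training strategy in non-blind scenarios, where the degradation model is known or assumed. To move a step closer to real-world applications, CNNs have also been trained blindly on a whole dataset that includes diverse degradations. However, the conditional distribution of a high-quality image given a diversely degraded one is too complicated to be learned by a single CNN. There have therefore also been methods that provide additional prior information to train a CNN. Unlike previous approaches, we focus more on the objective of restoration from a Bayesian perspective and on how to reformulate that objective. Specifically, our method relaxes the original posterior inference problem into more manageable sub-problems and thus behaves like a divide-and-conquer scheme. As a result, the proposed framework improves performance on several restoration problems compared to previous ones. In particular, our method delivers state-of-the-art performance on Gaussian denoising, real-world noise reduction, blind image super-resolution, and JPEG compression artifact reduction.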
